Afropean: Notes from Black Europe (2019) Johny Pitts Johny Pitts is a photographer and writer from the north of England who set out to explore "black Europe from the street up": those districts within European cities that, although once 'white spaces', are now occupied by Black people. Unhappy with the framing of the Black experience back home in post-industrial Sheffield, Pitts became a nomad, going abroad to seek out the sense of belonging he could not find in post-Brexit Britain, and Afropean details his journey through Paris, Brussels, Lisbon, Berlin, Stockholm and Moscow. However, Pitts isn't just avoiding the polarisation and structural racism embedded in contemporary British life. Rather, he is seeking a kind of supra-national community that transcends the reductive and limiting nationalisms of all European countries, most of which have based their national story on a self-serving mix of nostalgia and postcolonial fairy tales. Indeed, the term 'Afropean' is the key to understanding the goal of this captivating memoir. Pitts writes at the beginning of the book that the word was not coined merely as a response to the crude nativisms of Nigel Farage and Marine Le Pen, but that it:
encouraged me to think of myself as whole and unhyphenated. […] Here was a space where blackness was taking part in shaping European identity at large. It suggested the possibility of living in and with more than one idea: Africa and Europe, or, by extension, the Global South and the West, without being mixed-this, half-that or black-other. That being black in Europe didn't necessarily mean being an immigrant.

In search of this whole new theory of home, Pitts travels to the infamous banlieue of Clichy-sous-Bois just to the east of Paris, thence to Matonge in Brussels, as well as making a quick and abortive trip to Moscow and visiting other parallel communities throughout the continent. In these disparate environs, Pitts strikes up countless conversations with regular folk in order to hear their quotidian stories of living, and ultimately to move away from the idea that Black history is defined exclusively by slavery. Indeed, to Pitts, the idea of race is one that ultimately restricts one's humanity; the concept "is often forced to embody and speak for certain ideas, despite the fact it can't ever hold in both hands the full spectrum of a human life and the cultural nuances it creates." It's difficult to do justice to the effectiveness of the conversations Pitts has throughout his travels, but his shrewd attention to demeanour, language, raiment and expression vividly brings alive the people he talks to. Of related interest to fellow Brits are the many astute observations and comparisons with Black and working-class British life. The tone shifts quite often throughout this book. There might be an amusing aside one minute, such as the portrait of an African American tourist in Paris to whom "the whole city was a film set, with even its homeless people appearing to him as something oddly picturesque."
But the register abruptly changes when he visits Clichy-sous-Bois on the anniversary of an event important to the area, and an element of genuine danger is introduced when Johny briefly visits Moscow and barely gets out alive. What's especially remarkable about this book is the freshness of Pitts' treatment of many well-worn subjects. This can be seen in his account of Belgium under the reign of Leopold II, the history of Portuguese colonialism (actually mostly unknown to me), as well as in the way Pitts' own attitude to contemporary anti-fascist movements changes throughout an Antifa march. This chapter was an especial delight, and not only because it underlined just how much of Johny's trip was an inner journey by an author willing to have his mind changed. Although Johny travels alone throughout his journey, in the second half of the book he becomes increasingly accompanied by a number of Black intellectuals through his selective citing of Frantz Fanon, James Baldwin and Caryl Phillips. (Johny has also brought his camera for the journey, adding a personal touch to this already highly intimate book.) I suspect that his increasing use of Black intellectual writing in the latter half of the book may be because Pitts' hopes of an 'Afropean' existence ever becoming a reality are continually dashed and undercut. The unity among potential Afropeans appears more and more unrealisable as the narrative unfolds, for reasons which Johny explores both prosaically and poetically. Indeed, by the end of the book, it's unclear whether Johny has managed to find what he left the shores of England to find. But his mix of history, sociology and observation of other cultures right on my doorstep was something of a revelation to me.
Orwell's Roses (2021) Rebecca Solnit Orwell's Roses is an alternative journey through the life and afterlife of George Orwell, reimagining his life primarily through the lens of his attentiveness to nature. Yet this framing of the book as an 'alternative' history is only revisionist if we compare it to the usual view of Orwell as a bastion of 'free speech' and English 'common sense': the roses of the title were very much planted by Orwell in his Hertfordshire garden in 1936, and his yearning for nature was one of the many constants throughout his life. Indeed, Orwell wrote about wildlife and outdoor life whenever he could get away with it, taking pleasure in a blackbird's song and waxing nostalgic about the English countryside in his 1939 novel Coming Up for Air (reviewed yesterday). Solnit has a particular ability to evince unexpected connections between Orwell and the things he was writing about: Joseph Stalin's obsession with forcing lemons to grow in ludicrously cold climates; Orwell's slave-owning ancestors in Jamaica; Jamaica Kincaid's critique of colonialism in the flower garden; and the exploitative rose industry in Colombia that supplies the American market. Solnit introduces all of these new correspondences in a voice that feels like a breath of fresh air after decades of stodgy Orwellania, and without lapsing into a kind of verbal soft-focus. Indeed, the book displays a marked indifference towards the usual (male-centric) Orwell fandom. Her book draws to a close with a rereading of the 'dystopian' Nineteen Eighty-Four that completes her touching portrait of a more optimistic and hopeful Orwell, as well as a reflection on beauty and a manifesto for experiencing joy as an act of resistance.
The Disaster Artist (2013) Greg Sestero & Tom Bissell For those not already in the know, The Room was a 2003 film by director-producer-writer-actor Tommy Wiseau, an inscrutable Polish émigré with an impenetrable background, an idiosyncratic choice of wardrobe and a mysteriously large source of income. The film, which centres on a melodramatic love triangle, has since been described by several commentators and publications as one of the worst films ever made. Tommy's production completely bombed at the box office (the release was actually funded entirely by Wiseau personally), but the film slowly became a favourite at cult cinema screenings. Given Tommy's prominent and central role in the film, there was always an inherent cruelty involved in indulging in the spectacle of The Room: the audience was laughing because the film was astonishingly bad, of course, but Wiseau infused his film with a sincere earnestness that, in a heartless twist of irony, may be precisely why it is so terrible to begin with. Indeed, it should be stressed that The Room is not simply a 'bad' film, and therefore not worth paying any attention to: it is uncannily bad in a way that makes it eerily compelling to watch. It unintentionally subverts all the rules of filmmaking in a way that captivates the attention. Take one representative thirty-six-second scene, which showcases almost every problem in The Room: the acting, the lighting, the sound design, the pacing, the dialogue and the fact that this unnecessary scene (which does not advance the plot) even exists in the first place. One problem that such a clip doesn't capture, however, is Tommy's vulnerable ego. (He would later make the potentially conflicting claims that The Room was both an ironic cult success and that he is okay with people interpreting it sincerely.)
Indeed, the filmmaker's central role as Johnny (along with his Willy-Wonka-meets-Dracula persona) doesn't strike viewers as yet another vanity project; rather, it raises more questions than it answers. Why did Tommy even make this film? What is driving him psychologically? And why, and how, is he so spellbinding? On the surface, then, 2013's The Disaster Artist is a book about the making of one of the strangest films ever made, written by The Room's co-star Greg Sestero and journalist Tom Bissell. Naturally, you learn some jaw-dropping facts about the production and inspiration of the film, the seed of which was planted when Greg and Tommy went to see an early screening of The Talented Mr Ripley (1999). It turns out that Greg's character in The Room is based on Tommy's idiosyncratic misinterpretation of its plot, extending even to the character's name, Mark, which, in textbook Tommy style, was taken directly (or at least Tommy believed) from one of Ripley's movie stars: "Mark Damon" [sic]. Almost as absorbing as The Room itself, The Disaster Artist is partly a memoir about Thomas P. Wiseau and his cinematic masterpiece. But it could also be described as a biography of a dysfunctional male relationship and, almost certainly entirely unconsciously, a text about the limitations of heteronormativity. It is this latter element that struck me the most whilst reading this book: if you take a step back for a moment, there is something uniquely sad about Tommy's inability to connect with others, and then, when Wiseau poured his soul into his film, people just laughed. Despite the stories about his atrocious behaviour both on and off the film set, there's something deeply tragic about the whole affair. Jean-Luc Godard, who passed away earlier this year, once observed that every fictional film is a documentary of its actors. The Disaster Artist shows that this well-worn aphorism doesn't begin to cover it.
rec-def
In a recent talk I presented the rec-def library that I have excessively blogged about recently (here, here, here and here). I got quite flattering comments about that talk, so if you want to see if they were sincere, I suggest you watch the recording of Getting recursive definitions off their bottoms (but it's not necessary for the following).
After the talk, Franz Thoma approached me and told me a story of how he was once implementing the game Minesweeper in Haskell, and in particular the part of the logic where, after the user has uncovered a field, the game automatically uncovers all fields that are next to a neutral field, i.e. one with zero adjacent bombs. He was using a comonadic data structure, which makes a context-dependent parallel computation such as uncovering one field quite natural, and was hoping that using a suitable fix-point operator, he could elegantly obtain not just the next step, but directly the result of recursively uncovering all these fields. But, much to his disappointment, that did not work out: due to the recursion inherent in that definition, a knot-tying fixed-point operator will lead to a cyclic definition.
He was wondering if the rec-def library could have helped him, and we sat down to find out, and this is the tale of this blog post. I will avoid the comonadic abstractions and program it more naively, though, so as not to lose too many readers along the way. Maybe read Chris Penner's blog post and Finch's functional pearl Getting a Quick Fix on Comonads if you are curious about that angle.
For the board we use the Array data type; its Ix-based indexing is quite useful for grids:
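The type definitions themselves are not shown in this excerpt; a minimal sketch consistent with how C and Grid are used later in the post (coordinates as pairs of Ints, a grid as an Array indexed by them) would be:

```haskell
import Data.Array

-- Sketch (not from the original post): coordinates are pairs of Ints,
-- and a grid is an array indexed by such coordinates.
type C = (Int, Int)
type Grid a = Array C a
```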
The library lacks a function to generate an array from a generating function, but it is easy to add:
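The definition itself is not reproduced in this excerpt; a possible sketch of such a helper, using only functions from Data.Array, is:

```haskell
import Data.Array

-- Sketch: build an array by applying a function to every index in its range
genArray :: Ix i => (i, i) -> (i -> e) -> Array i e
genArray r f = listArray r [ f i | i <- range r ]
```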
Let's also fix the size of the board, as a pair of lower and upper bounds (this is the format that the Ix type class needs):
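The definition is not shown in this excerpt; since board1 below has 4×4 entries, a consistent sketch would be:

```haskell
-- Sketch: bounds of the 4×4 board (lower and upper corner, as Ix expects)
size :: ((Int, Int), (Int, Int))
size = ((0, 0), (3, 3))
```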
Now a board is simply a grid of boolean values, with True indicating that a bomb is there:
type Board = Grid Bool

board1 :: Board
board1 = listArray size
  [ False, False, False, False
  , True,  False, False, False
  , True,  False, False, False
  , False, False, False, False
  ]
With the following helper (printing a bomb as *), we can print the board:
pGrid :: (C -> String) -> String
pGrid p = unlines
    [ concat [ p' (y,x) | x <- [lx-1 .. ux+1] ] | y <- [ly-1 .. uy+1] ]
  where
    ((lx,ly),(ux,uy)) = size
    p' c | inRange size c = p c
    p' _ = "#"

pBombs :: Board -> String
pBombs b = pGrid $ \c -> if b ! c then "*" else " "
Here, b ! c looks up the coordinate in the array, and is True when there is a bomb at that coordinate.
So here is our board, with two bombs:
ghci> putStrLn $ pBombs board1
######
# #
#* #
#* #
# #
######
But that's not what we want to show to the user: every field should have a number that indicates the number of bombs in the surrounding fields. To that end, we first define a function that takes a coordinate, and returns all adjacent coordinates. This also takes care of the border, using inRange:
neighbors :: C -> [C]
neighbors (x,y) =
  [ c
  | (dx, dy) <- range ((-1,-1), (1,1))
  , (dx, dy) /= (0,0)
  , let c = (x + dx, y + dy)
  , inRange size c
  ]
data H = Bomb | Hint Int deriving Eq

hint :: Board -> C -> H
hint b c | b ! c = Bomb
hint b c = Hint $ sum [ 1 | c' <- neighbors c, b ! c' ]

pCell :: Board -> C -> String
pCell b c = case hint b c of
  Bomb   -> "*"
  Hint 0 -> " "
  Hint n -> show n

pBoard :: Board -> String
pBoard b = pGrid (pCell b)
ghci> putStrLn $ pBoard board1
######
#11 #
#*2 #
#*2 #
#11 #
######
Next we have to add masks: we need to keep track of which fields the user already sees. We again use a grid of booleans, and define a function to print a board with the masked fields hidden behind ?:
type Mask = Grid Bool

mask1 :: Mask
mask1 = listArray size
  [ True,  True,  True,  False
  , False, False, False, False
  , False, False, False, False
  , False, False, False, False
  ]

pMasked :: Board -> Mask -> String
pMasked b m = pGrid $ \c -> if m ! c then pCell b c else "?"
ghci> putStrLn $ pMasked board1 mask1
######
#11 ?#
#????#
#????#
#????#
######
solve0 :: Board -> Mask -> Mask
solve0 b m0 = m1
  where
    m1 :: Mask
    m1 = genArray size $ \c ->
      m0 ! c ||
      or [ m0 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]
This calculates a new mask m1 from the old one m0 by the following logic: a field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields are visible.
This works so far: it uncovered the three fields next to the one neutral visible field:
ghci> putStrLn $ pMasked board1 $ solve0 board1 mask1
######
#11 #
#?2 #
#????#
#????#
######
But that's not quite what we want: we want to keep doing that to uncover all fields. So let's change the rule: a field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields will be visible.
In the code, this is just a single character change: instead of looking at m0 to see if a neighbor is visible, we look at m1:
solve1 :: Board -> Mask -> Mask
solve1 b m0 = m1
  where
    m1 :: Mask
    m1 = genArray size $ \c ->
      m0 ! c ||
      or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]
(This corresponds to the kfix comonadic fixed-point operator in his code, I believe.)
Does it work? It seems so:
ghci> putStrLn $ pMasked board1 $ solve1 board1 mask1
######
#11 #
#?2 #
#?2 #
#?1 #
######
Amazing, isn't it!
Unfortunately, it seems to work by accident. If I start with a different mask:
mask2 :: Mask
mask2 = listArray size
[ True, True, False, False
, False, False, False, False
, False, False, False, False
, False, False, False, True
]
ghci> putStrLn $ pMasked board1 mask2
######
#11??#
#????#
#????#
#??? #
######
Then our solve1 function does not work, and just sits there:
ghci> putStrLn $ pMasked board1 $ solve1 board1 mask2
######
#11^CInterrupted.
Why did it work before, but not now?
It fails to work because, as the code tries to figure out if a field will be uncovered, it needs to know if the next field will be uncovered. But to figure that out, it needs to know if the present field will be uncovered. With the normal boolean connectives (|| and or), this does not make progress.
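Stripped of the grid, the cycle looks like this (a minimal sketch, not from the original post):

```haskell
-- To evaluate x, (||) first needs the value of x itself,
-- so this definition loops forever (it denotes bottom).
x :: Bool
x = x || False
```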
It worked with mask1 more or less by accident: none of the fields in the first column have neutral neighbors, so nothing happens there. And for all the fields in the third and fourth column, the code will know for sure that they will be uncovered based on their upper neighbors, which come first in the neighbors list, and due to the short-circuiting properties of ||, it doesn't have to look at the later cells, and the vicious cycle is avoided.
This is where rec-def comes in: by using the RBool type in m1 instead of plain Bool, the recursive self-reference is not a problem, and it simply works:
import qualified Data.Recursive.Bool as RB

solve2 :: Board -> Mask -> Mask
solve2 b m0 = fmap RB.get m1
  where
    m1 :: Grid RB.RBool
    m1 = genArray size $ \c ->
      RB.mk (m0 ! c) RB.||
      RB.or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]
Note that this is the same m1; I just replaced Bool with RBool, || with RB.|| and or with RB.or, and used RB.get at the end to get a normal boolean out. And here we go:
ghci> putStrLn $ pMasked board1 $ solve2 board1 mask2
######
#11 #
#?2 #
#?2 #
#?1 #
######
That's the end of this iteration of "let's look at a tying-the-knot problem and see how rec-def helps", which always ends up a bit anticlimactic because it just works, at least in these cases. Hope you enjoyed it nevertheless.
Jobs or CronJobs). We originally had a NodeJS API with endpoints to upload files and store them on S3 compatible services that were later accessed via HTTPS, but the requirements changed and we needed to be able to publish folders instead of individual files using their original names and apply access restrictions using our API.
Thinking about our requirements, the use of a regular filesystem to keep the files and folders was a good option, as uploading and serving files is simple.
For the upload I decided to use the sftp protocol, mainly because I already
had an sftp container image based on
mysecureshell prepared; once
we settled on that we added sftp support to the API server and configured it
to upload the files to our server instead of using S3 buckets.
To publish the files we added a nginx container configured
to work as a reverse proxy that uses the
ngx_http_auth_request_module
to validate access to the files (the sub request is configurable, in our
deployment we have configured it to call our API to check if the user can
access a given URL).
Finally we added a third container when we needed to execute some tasks
directly on the filesystem (using kubectl exec
with the existing containers
did not seem a good idea, as that is not supported by CronJobs
objects, for
example).
The solution we found avoiding the NIH Syndrome (i.e. write our own tool) was
to use the webhook tool to provide the
endpoints to call the scripts; for now we have three:
- du: get the disk usage of a PATH,
- hardlink: hardlink all the files that are identical on the filesystem,
- s3sync: synchronise files against an S3 bucket.

The mysecureshell container can be used to provide an sftp service with multiple users (although the files are owned by the same UID and GID) using standalone containers (launched with docker or podman) or in an orchestration system like kubernetes, as we are going to do here.
The image is generated using the following Dockerfile
:
ARG ALPINE_VERSION=3.16.2
FROM alpine:$ALPINE_VERSION as builder
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
RUN apk update &&\
apk add --no-cache alpine-sdk git musl-dev &&\
git clone https://github.com/sto/mysecureshell.git &&\
cd mysecureshell &&\
./configure --prefix=/usr --sysconfdir=/etc --mandir=/usr/share/man\
--localstatedir=/var --with-shutfile=/var/lib/misc/sftp.shut --with-debug=2 &&\
make all && make install &&\
rm -rf /var/cache/apk/*
FROM alpine:$ALPINE_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
COPY --from=builder /usr/bin/mysecureshell /usr/bin/mysecureshell
COPY --from=builder /usr/bin/sftp-* /usr/bin/
RUN apk update &&\
apk add --no-cache openssh shadow pwgen &&\
sed -i -e "s|^.*\(AuthorizedKeysFile\).*$|\1 /etc/ssh/auth_keys/%u|"\
 /etc/ssh/sshd_config &&\
mkdir /etc/ssh/auth_keys &&\
cat /dev/null > /etc/motd &&\
add-shell '/usr/bin/mysecureshell' &&\
rm -rf /var/cache/apk/*
COPY bin/* /usr/local/bin/
COPY etc/sftp_config /etc/ssh/
COPY entrypoint.sh /
EXPOSE 22
VOLUME /sftp
ENTRYPOINT ["/entrypoint.sh"]
CMD ["server"]
The /etc/sftp_config file is used to configure the mysecureshell server to have all the user homes under /sftp/data, only allow them to see the files under their home directories as if they were at the root of the server, and close idle connections after 5m of inactivity:
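The file itself is not reproduced in this excerpt; a sketch of what it could contain, using directives from the mysecureshell documentation (the exact values here are assumptions):

```
# /etc/ssh/sftp_config (sketch; values are assumptions)
<Default>
    Home          /sftp/data/$USER    # user homes under /sftp/data
    StayAtHome    true                # keep users inside their home
    VirtualChroot true                # show the home as the root of the server
    IdleTimeOut   5m                  # close idle connections after 5 minutes
</Default>
```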
The entrypoint.sh script is the one responsible for preparing the container for the users included in the /secrets/user_pass.txt file (it creates the users with their HOME directories under /sftp/data and a /bin/false shell, and creates the key files from /secrets/user_keys.txt if available).
The script expects a couple of environment variables:

- SFTP_UID: UID used to run the daemon and for all the files; it has to be different than 0 (all the files managed by this daemon are going to be owned by the same user and group, even if the remote users are different).
- SFTP_GID: GID used to run the daemon and for all the files; it has to be different than 0.

It also uses the SSH_PORT and SSH_PARAMS values if present.
It also requires the following files (they can be mounted as secrets in kubernetes):

- /secrets/host_keys.txt: Text file containing the ssh server keys in mime format; the file is processed using the reformime utility (the one included on busybox) and can be generated using the gen-host-keys script included on the container (it uses ssh-keygen and makemime).
- /secrets/user_pass.txt: Text file containing lines of the form username:password_in_clear_text (only the users included on this file are available on the sftp server; in fact in our deployment we use only the scs user for everything).
- /secrets/user_keys.txt: Text file that contains lines of the form username:public_ssh_ed25519_or_rsa_key; the public keys are installed on the server and can be used to log into the sftp server if the username exists on the user_pass.txt file.

The relevant parts of the entrypoint.sh script are:
The container also includes a couple of auxiliary scripts, the first one can be
used to generate the host_keys.txt
file as follows:
$ docker run --rm stodh/mysecureshell gen-host-keys > host_keys.txt
The second script generates a .tar file that contains auth data for the list of usernames passed to it (the file contains a user_pass.txt file with random passwords for the users, public and private ssh keys for them and the user_keys.txt file that matches the generated keys).
To generate a tar
file for the user scs
we can execute the following:
$ docker run --rm stodh/mysecureshell gen-users-tar scs > /tmp/scs-users.tar
To check the contents of the generated tar file and extract the user_pass.txt file we can do:
$ tar tvf /tmp/scs-users.tar
-rw-r--r-- root/root 21 2022-09-11 15:55 user_pass.txt
-rw-r--r-- root/root 822 2022-09-11 15:55 user_keys.txt
-rw------- root/root 387 2022-09-11 15:55 id_ed25519-scs
-rw-r--r-- root/root 85 2022-09-11 15:55 id_ed25519-scs.pub
-rw------- root/root 3357 2022-09-11 15:55 id_rsa-scs
-rw------- root/root 3243 2022-09-11 15:55 id_rsa-scs.pem
-rw-r--r-- root/root 729 2022-09-11 15:55 id_rsa-scs.pub
$ tar xfO /tmp/scs-users.tar user_pass.txt
scs:20JertRSX2Eaar4x
The nginx-scs container is generated using the following Dockerfile:
ARG NGINX_VERSION=1.23.1
FROM nginx:$NGINX_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
RUN rm -f /docker-entrypoint.d/*
COPY docker-entrypoint.d/* /docker-entrypoint.d/
The image removes the docker-entrypoint.d scripts from the standard image and adds a new one that configures the web server as we want using a couple of environment variables:

- AUTH_REQUEST_URI: URL to use for the auth_request; if the variable is not found on the environment auth_request is not used.
- HTML_ROOT: Base directory of the web server; if not passed the default /usr/share/nginx/html is used.

If neither variable is defined the container behaves as the standard nginx image.
The contents of the configuration script are:
As we will see later, the idea is to use the /sftp/data or /sftp/data/scs folder as the root of the web published by this container and create an Ingress object to provide access to it outside of our kubernetes cluster.

The webhook-scs container is generated using the following Dockerfile:
ARG ALPINE_VERSION=3.16.2
ARG GOLANG_VERSION=alpine3.16
FROM golang:$GOLANG_VERSION AS builder
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
ENV WEBHOOK_VERSION 2.8.0
ENV WEBHOOK_PR 549
ENV S3FS_VERSION v1.91
WORKDIR /go/src/github.com/adnanh/webhook
RUN apk update &&\
apk add --no-cache -t build-deps curl libc-dev gcc libgcc patch
RUN curl -L --silent -o webhook.tar.gz\
https://github.com/adnanh/webhook/archive/$ WEBHOOK_VERSION .tar.gz &&\
tar xzf webhook.tar.gz --strip 1 &&\
curl -L --silent -o $ WEBHOOK_PR .patch\
https://patch-diff.githubusercontent.com/raw/adnanh/webhook/pull/$ WEBHOOK_PR .patch &&\
patch -p1 < $ WEBHOOK_PR .patch &&\
go get -d && \
go build -o /usr/local/bin/webhook
WORKDIR /src/s3fs-fuse
RUN apk update &&\
apk add ca-certificates build-base alpine-sdk libcurl automake autoconf\
libxml2-dev libressl-dev mailcap fuse-dev curl-dev
RUN curl -L --silent -o s3fs.tar.gz\
https://github.com/s3fs-fuse/s3fs-fuse/archive/refs/tags/$S3FS_VERSION.tar.gz &&\
tar xzf s3fs.tar.gz --strip 1 &&\
./autogen.sh &&\
./configure --prefix=/usr/local &&\
make -j && \
make install
FROM alpine:$ALPINE_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
WORKDIR /webhook
RUN apk update &&\
apk add --no-cache ca-certificates mailcap fuse libxml2 libcurl libgcc\
libstdc++ rsync util-linux-misc &&\
rm -rf /var/cache/apk/*
COPY --from=builder /usr/local/bin/webhook /usr/local/bin/webhook
COPY --from=builder /usr/local/bin/s3fs /usr/local/bin/s3fs
COPY entrypoint.sh /
COPY hooks/* ./hooks/
EXPOSE 9000
ENTRYPOINT ["/entrypoint.sh"]
CMD ["server"]
To build the webhook tool we apply the PATCH included on this pull request against a released version of the source instead of creating a fork.
The entrypoint.sh
script is used to generate the webhook
configuration file
for the existing hooks
using environment variables (basically the
WEBHOOK_WORKDIR
and the *_TOKEN
variables) and launch the webhook
service:
The entrypoint.sh
script generates the configuration file for the webhook
server calling functions that print a yaml
section for each hook
and
optionally adds rules to validate access to them comparing the value of a
X-Webhook-Token
header against predefined values.
The expected token values are taken from environment variables; we can define a token variable for each hook (DU_TOKEN, HARDLINK_TOKEN or S3_TOKEN) and a fallback value (COMMON_TOKEN). If no token variable is defined for a hook, no check is done and everybody can call it.
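For illustration, a generated entry for the du hook could look roughly like this (a sketch following the webhook tool's hook definition format; the script path and token value are made-up examples):

```yaml
# Sketch of a generated hook definition (path and token are examples)
- id: du
  execute-command: /webhook/hooks/du.sh
  command-working-directory: /sftp/data
  pass-arguments-to-command:
    - source: url
      name: path
  pass-environment-to-command:
    - source: string
      envname: OUTPUT_FORMAT
      name: json
  trigger-rule:
    match:
      type: value
      value: "some-secret-value"
      parameter:
        source: header
        name: X-Webhook-Token
```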
The Hook Definition documentation explains the options you can use for each hook; the ones we have right now do the following:
- du: runs on the $WORKDIR directory, passes as first argument to the script the value of the path query parameter and sets the variable OUTPUT_FORMAT to the fixed value json (we use that to print the output of the script in JSON format instead of text).
- hardlink: runs on the $WORKDIR directory and takes no parameters.
- s3sync: runs on the $WORKDIR directory and sets a lot of environment variables from values read from the JSON encoded payload sent by the caller (all the values must be sent by the caller even if they are assigned an empty value; if they are missing the hook fails without calling the script); we also set the stream-command-output value to true to make the script show its output as it is working (we patched the webhook source to be able to use this option).

The du hook script checks if the argument passed is a directory, computes its size using the du command and prints the results in text format or as a JSON dictionary:
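The script itself is not included in this excerpt; a sketch of the behaviour described above (the function name and JSON shape are assumptions, not the original script):

```shell
#!/bin/sh
# Sketch of a du-style hook: print the size of the directory passed as $1,
# as plain text or as JSON when OUTPUT_FORMAT=json (names are assumptions).
du_hook() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "error: '$dir' is not a directory" >&2
    return 1
  fi
  size="$(du -s "$dir" | awk '{print $1}')"
  if [ "$OUTPUT_FORMAT" = "json" ]; then
    printf '{"path":"%s","size":"%s"}\n' "$dir" "$size"
  else
    printf '%s\t%s\n' "$size" "$dir"
  fi
}

# Example usage: print the size of /tmp as JSON
OUTPUT_FORMAT=json du_hook /tmp
```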
The hardlink hook script is really simple; it just runs the util-linux version of the hardlink command on its working directory:
We use that to reduce the size of the stored content; to manage versions of files and folders we keep each version on a separate directory, and when one or more files are not changed this script makes them hardlinks to the same file on disk, reducing the space used.

The s3sync hook script uses the s3fs tool to mount a bucket and synchronise data between a folder inside the bucket and a directory on the filesystem using rsync; all values needed to execute the task are taken from environment variables:
The service is deployed using a StatefulSet with one replica. Our production deployment is done on AWS, and to be able to scale we use EFS for our PersistentVolume; the idea is that the volume has no size limit, its AccessMode can be set to ReadWriteMany and we can mount it from multiple instances of the Pod without issues, even if they are in different availability zones.
For development we use k3d and we are also able to scale the StatefulSet for testing because we use a ReadWriteOnce PVC, but it points to a hostPath that is backed by a folder that is mounted on all the compute nodes, so in reality Pods in different k3d nodes use the same folder on the host.
The secrets are generated using the auxiliary scripts of the mysecureshell container, which can be run using kubernetes pods as follows (we are only creating the scs user):
$ kubectl run "mysecureshell" --restart='Never' --quiet --rm --stdin \
--image "stodh/mysecureshell:latest" -- gen-host-keys >"./host_keys.txt"
$ kubectl run "mysecureshell" --restart='Never' --quiet --rm --stdin \
--image "stodh/mysecureshell:latest" -- gen-users-tar scs >"./users.tar"
Once we have the files we can generate the secrets.yaml file as follows:
$ tar xf ./users.tar user_keys.txt user_pass.txt
$ kubectl --dry-run=client -o yaml create secret generic "scs-secret" \
--from-file="host_keys.txt=host_keys.txt" \
--from-file="user_keys.txt=user_keys.txt" \
--from-file="user_pass.txt=user_pass.txt" > ./secrets.yaml
The generated secrets.yaml will look like the following file (the base64 values would match the content of the files, of course):
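For reference, the structure produced by the kubectl command above is the standard Secret layout; a sketch with placeholder base64 payloads:

```yaml
# Sketch of the generated secrets.yaml (base64 values are placeholders)
apiVersion: v1
kind: Secret
metadata:
  creationTimestamp: null
  name: scs-secret
data:
  host_keys.txt: IyBwbGFjZWhvbGRlcg==
  user_keys.txt: IyBwbGFjZWhvbGRlcg==
  user_pass.txt: IyBwbGFjZWhvbGRlcg==
```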
The PersistentVolumeClaim (used by the statefulSet) can be as simple as this:
On this definition we don't set the storageClassName, so the default one is used.
For the k3d deployment we create a PersistentVolume as required by the Local Persistence Volume Static Provisioner (note that the /volumes/scs-pv folder has to be created by hand; in our k3d system we mount the same host directory on the /volumes path of all the nodes and create the scs-pv directory by hand before deploying the persistent volume):
And to make sure that everything works as expected we update the PVC definition
to add the right storageClassName
:
On AWS we don't need to create the PersistentVolume (we are using the aws-efs-csi-driver which supports Dynamic Provisioning), but we add the storageClassName (we set it to the one mapped to the EFS driver, i.e. efs-sc) and set ReadWriteMany as the accessMode:
The definition of the statefulSet is as follows:
Notes about the containers:
- nginx: As this is an example the web server is not using an AUTH_REQUEST_URI and uses the /sftp/data directory as the root of the web (to get to the files uploaded for the scs user we will need to use /scs/ as a prefix on the URLs).
- mysecureshell: We are adding the IPC_OWNER capability to the container to be able to use some of the sftp-* commands inside it, but they are not really needed, so adding the capability is optional.
- webhook: We are launching this container in privileged mode to be able to use s3fs-fuse, as it will not work otherwise for now (see this kubernetes issue); if the functionality is not needed the container can be executed with regular privileges. Besides, as we are not enabling public access to this service we don't define *_TOKEN variables (if required the values should be read from a Secret object).

The devfuse volume is only needed if we plan to use the s3fs command on the webhook container; if not we can remove the volume definition and its mounts.

To publish the ports we use a Service object:
To be able to access the scs files from the outside we can add an ingress object like the following (the definition is for testing using the localhost name):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: scs-ingress
  labels:
    app.kubernetes.io/name: scs
spec:
  ingressClassName: nginx
  rules:
  - host: 'localhost'
    http:
      paths:
      - path: /scs
        pathType: Prefix
        backend:
          service:
            name: scs-svc
            port:
              number: 80
To deploy the statefulSet we create a namespace and apply the object definitions shown before:
$ kubectl create namespace scs-demo
namespace/scs-demo created
$ kubectl -n scs-demo apply -f secrets.yaml
secret/scs-secrets created
$ kubectl -n scs-demo apply -f pvc.yaml
persistentvolumeclaim/scs-pvc created
$ kubectl -n scs-demo apply -f statefulset.yaml
statefulset.apps/scs created
$ kubectl -n scs-demo apply -f service.yaml
service/scs-svc created
$ kubectl -n scs-demo apply -f ingress.yaml
ingress.networking.k8s.io/scs-ingress created
Once the objects are deployed we can review them using kubectl:
$ kubectl -n scs-demo get all,secrets,ingress
NAME READY STATUS RESTARTS AGE
pod/scs-0 3/3 Running 0 24s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/scs-svc ClusterIP 10.43.0.47 <none> 22/TCP,80/TCP,9000/TCP 21s
NAME READY AGE
statefulset.apps/scs 1/1 24s
NAME TYPE DATA AGE
secret/default-token-mwcd7 kubernetes.io/service-account-token 3 53s
secret/scs-secrets Opaque 3 39s
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/scs-ingress nginx localhost 172.21.0.5 80 17s
In our deployment we connect to the sftp server from other Pods, but to test the system we are going to do a kubectl port-forward and connect to the server using our host client and the password we have generated (it is on the user_pass.txt file, inside the users.tar archive):
$ kubectl -n scs-demo port-forward service/scs-svc 2020:22 &
Forwarding from 127.0.0.1:2020 -> 22
Forwarding from [::1]:2020 -> 22
$ PF_PID=$!
$ sftp -P 2020 scs@127.0.0.1 1
Handling connection for 2020
The authenticity of host '[127.0.0.1]:2020 ([127.0.0.1]:2020)' can't be \
established.
ED25519 key fingerprint is SHA256:eHNwCnyLcSSuVXXiLKeGraw0FT/4Bb/yjfqTstt+088.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[127.0.0.1]:2020' (ED25519) to the list of known \
hosts.
scs@127.0.0.1's password: **********
Connected to 127.0.0.1.
sftp> ls -la
drwxr-xr-x 2 sftp sftp 4096 Sep 25 14:47 .
dr-xr-xr-x 3 sftp sftp 4096 Sep 25 14:36 ..
sftp> !date -R > /tmp/date.txt 2
sftp> put /tmp/date.txt .
Uploading /tmp/date.txt to /date.txt
date.txt 100% 32 27.8KB/s 00:00
sftp> ls -l
-rw-r--r-- 1 sftp sftp 32 Sep 25 15:21 date.txt
sftp> ln date.txt date.txt.1 3
sftp> ls -l
-rw-r--r-- 2 sftp sftp 32 Sep 25 15:21 date.txt
-rw-r--r-- 2 sftp sftp 32 Sep 25 15:21 date.txt.1
sftp> put /tmp/date.txt date.txt.2 4
Uploading /tmp/date.txt to /date.txt.2
date.txt 100% 32 27.8KB/s 00:00
sftp> ls -l 5
-rw-r--r-- 2 sftp sftp 32 Sep 25 15:21 date.txt
-rw-r--r-- 2 sftp sftp 32 Sep 25 15:21 date.txt.1
-rw-r--r-- 1 sftp sftp 32 Sep 25 15:21 date.txt.2
sftp> exit
$ kill "$PF_PID"
[1] + terminated kubectl -n scs-demo port-forward service/scs-svc 2020:22
In the previous session we connect to the sftp service on the forwarded port with the scs user. To check that the uploaded files are served by the web server we can get the date.txt file from the URL http://localhost/scs/date.txt:
$ curl -s http://localhost/scs/date.txt
Sun, 25 Sep 2022 17:21:51 +0200
To finish this demonstration we show how to call the hooks directly, from a CronJob and from a Job.
Direct call (du)

In our deployment the direct calls are done from other Pods; to simulate it we are going to do a port-forward and call the script with an existing PATH (the root directory) and a bad one:
$ kubectl -n scs-demo port-forward service/scs-svc 9000:9000 >/dev/null &
$ PF_PID=$!
$ JSON="$(curl -s "http://localhost:9000/hooks/du?path=.")"
$ echo $JSON
{"path":"","bytes":"4160"}
$ JSON="$(curl -s "http://localhost:9000/hooks/du?path=foo")"
$ echo $JSON
{"error":"The provided PATH ('foo') is not a directory"}
$ kill $PF_PID
We call the script with the desired PATH, and the output is in json format because we export OUTPUT_FORMAT with the value json on the webhook configuration.

Cronjob (hardlink)

As explained before, the webhook container can be used to run cronjobs; the following one uses an alpine container to call the hardlink script each minute (that setup is for testing, obviously):
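A sketch of such a CronJob follows; the hook URL and the exact wget invocation are assumptions (the /hooks/hardlink path mirrors the /hooks/du call shown earlier), while the cronjob=hardlink label matches the kubectl queries used later:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hardlink
spec:
  schedule: "* * * * *"         # every minute, for testing only
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            cronjob: hardlink   # used to select the Pods with kubectl
        spec:
          restartPolicy: Never
          containers:
          - name: hardlink
            image: alpine:3.16
            command:            # busybox wget prints the hook output
            - wget
            - -q
            - -O-
            - http://scs-svc:9000/hooks/hardlink
```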
The following console session shows how we create the object, allow a couple of executions and remove it (in production we keep it running, but scheduled once a day instead of every minute):
$ kubectl -n scs-demo apply -f webhook-cronjob.yaml 1
cronjob.batch/hardlink created
$ kubectl -n scs-demo get pods -l "cronjob=hardlink" -w 2
NAME READY STATUS RESTARTS AGE
hardlink-27735351-zvpnb 0/1 Pending 0 0s
hardlink-27735351-zvpnb 0/1 ContainerCreating 0 0s
hardlink-27735351-zvpnb 0/1 Completed 0 2s
^C
$ kubectl -n scs-demo logs pod/hardlink-27735351-zvpnb 3
Mode: real
Method: sha256
Files: 3
Linked: 1 files
Compared: 0 xattrs
Compared: 1 files
Saved: 32 B
Duration: 0.000220 seconds
$ sleep 60
$ kubectl -n scs-demo get pods -l "cronjob=hardlink" 4
NAME READY STATUS RESTARTS AGE
hardlink-27735351-zvpnb 0/1 Completed 0 83s
hardlink-27735352-br5rn 0/1 Completed 0 23s
$ kubectl -n scs-demo logs pod/hardlink-27735352-br5rn 5
Mode: real
Method: sha256
Files: 3
Linked: 0 files
Compared: 0 xattrs
Compared: 0 files
Saved: 0 B
Duration: 0.000070 seconds
$ kubectl -n scs-demo delete -f webhook-cronjob.yaml 6
cronjob.batch "hardlink" deleted
We watch the Pods using the cronjob label and interrupt the command once we see that the first run has been completed. The logs show that date.txt.2 has been replaced by a hardlink (the summary does not name the file, but it is the only possibility, knowing the contents from the original upload).

Job (s3sync)

The following job can be used to synchronise the contents of a directory in an S3 bucket with the SCS Filesystem:
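A sketch of such a Job follows; the image, the hook URL and the idea of POSTing the parameters file to the webhook are assumptions, while the secret name and the cronjob=s3sync label come from the console session below:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: s3sync
spec:
  backoffLimit: 0
  template:
    metadata:
      labels:
        cronjob: s3sync               # used to select the Pod with kubectl
    spec:
      restartPolicy: Never
      containers:
      - name: s3sync
        image: curlimages/curl:7.85.0  # assumed image
        command:                       # curl prints the hook output to the logs
        - curl
        - -s
        - --data-binary
        - '@/secrets/s3sync.json'
        - http://scs-svc:9000/hooks/s3sync
        volumeMounts:
        - name: job-secrets
          mountPath: /secrets
          readOnly: true
      volumes:
      - name: job-secrets
        secret:
          secretName: webhook-job-secrets
```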
The file with parameters for the script must be something like this:
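The real parameter file is not reproduced here; a hypothetical s3sync.json could look like this (all field names are made up for illustration; only the bucket name s3fs-test appears in the job logs):

```json
{
  "aws": {
    "key": "<AWS_ACCESS_KEY_ID>",
    "secret_key": "<AWS_SECRET_ACCESS_KEY>"
  },
  "s3": {
    "region": "us-east-1",
    "bucket": "s3fs-test",
    "path": "test"
  },
  "scs_path": "/sftp/data/scs/test"
}
```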
Once we have both files we can run the Job as follows:
$ kubectl -n scs-demo create secret generic webhook-job-secrets \ 1
--from-file="s3sync.json=s3sync.json"
secret/webhook-job-secrets created
$ kubectl -n scs-demo apply -f webhook-job.yaml 2
job.batch/s3sync created
$ kubectl -n scs-demo get pods -l "cronjob=s3sync" 3
NAME READY STATUS RESTARTS AGE
s3sync-zx2cj 0/1 Completed 0 12s
$ kubectl -n scs-demo logs s3sync-zx2cj 4
Mounted bucket 's3fs-test' on '/root/tmp.jiOjaF/s3data'
sending incremental file list
created directory ./test
./
kyso.png
Number of files: 2 (reg: 1, dir: 1)
Number of created files: 2 (reg: 1, dir: 1)
Number of deleted files: 0
Number of regular files transferred: 1
Total file size: 15,075 bytes
Total transferred file size: 15,075 bytes
Literal data: 15,075 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.147 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 15,183
Total bytes received: 74
sent 15,183 bytes received 74 bytes 30,514.00 bytes/sec
total size is 15,075 speedup is 0.99
Called umount for '/root/tmp.jiOjaF/s3data'
Script exit code: 0
$ kubectl -n scs-demo delete -f webhook-job.yaml 5
job.batch "s3sync" deleted
$ kubectl -n scs-demo delete secrets webhook-job-secrets 6
secret "webhook-job-secrets" deleted
We create the webhook-job-secrets secret that contains the s3sync.json file. Using the cronjob=s3sync label we get the Pods executed by the job.

Reproducible builds provide additional protection and validation against attempts to compromise build systems. They ensure the binary products of each build system match: i.e., they are built from the same source, regardless of variable metadata such as the order of input files, timestamps, locales, and paths. Reproducible builds are those where re-running the build steps with identical input artifacts results in bit-for-bit identical output. Builds that cannot meet this must provide a justification why the build cannot be made reproducible.

The full press release is available online.
This is the core of a two-decade-old debate among security people, and it's one that the benevolent God faction has consistently had the upper hand in. They're the curated computing advocates who insist that preventing you from choosing an alternative app store or side-loading a program is for your own good because if it's possible for you to override the manufacturer's wishes, then malicious software may impersonate you to do so, or you might be tricked into doing so. [..] This benevolent dictatorship model only works so long as the dictator is both perfectly benevolent and perfectly competent. We know the dictators aren't always benevolent. [ ] But even if you trust a dictator's benevolence, you can't trust in their perfection. Everyone makes mistakes. Benevolent dictator computing works well, but fails badly. Designing a computer that intentionally can't be fully controlled by its owner is a nightmare, because that is a computer that, once compromised, can attack its owner with impunity.
The essential and required package sets became 100% reproducible in Debian bookworm on the amd64 and arm64 architectures. These two subsets of the full Debian archive refer to Debian package priority levels as described in the 2.5 Priorities section of the Debian Policy; there is no canonical minimal installation package set in Debian due to its diverse methods of installation.
As it happens, these package sets are not reproducible on the i386 architecture because the ncurses package on that architecture is not yet reproducible, and the sed package currently fails to build from source on armhf too. The full list of reproducible packages within these package sets can be viewed within our QA system, such as on the page of required packages in amd64 and the list of essential packages on arm64, both for Debian bullseye.
This can be tried out, for example, using podman on Debian bullseye:
$ sudo apt install podman
$ podman run --rm -it debian:bullseye bash
The (pre-built) image used is itself built using debuerreotype, as explained on docker.debian.net. This page also details how to build the image yourself and what checksums are expected if you do so.
$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable.tar
This works for (at least) Debian unstable, bullseye and bookworm, and is tested automatically by a number of QA jobs set up by Holger Levsen (unstable, bookworm and bullseye). For the output to be reproducible, all file timestamps are required to be not greater than SOURCE_DATE_EPOCH.
To obtain reproducible output we also need to delete a number of files: /etc/machine-id, /var/cache/ldconfig/aux-cache, /var/log/dpkg.log, /var/log/alternatives.log and /var/log/bootstrap.log; for cdebootstrap we also need to delete the /var/log/apt/history.log and /var/log/apt/term.log files. Issues were reported about the /etc/machine-id file in both debootstrap [ ] and cdebootstrap [ ].
Issue types updated this month include randomness_in_browserify_output [ ], haskell_abi_hash_differences [ ] and nondeterministic_ids_in_html_output_generated_by_python_sphinx_panels [ ]. Lastly, Mattia Rizzolo removed the deterministic flag from the captures_kernel_variant flag [ ].
The post itself contains a lot more details, including a brief discussion of tooling:

Ignoring the pesky unknown packages, it is more like ~93% reproducible and ~7% unreproducible... that feels a bit better to me! These numbers wander around over time, mostly due to packages moving back into an "unknown" state while the build farms catch up with each other... although the above numbers seem to have been pretty consistent over the last few days.

Elsewhere in GNU Guix, however, Vagrant updated a number of packages such as itpp [ ], perl-class-methodmaker [ ], libnet [ ], directfb [ ] and mm-common [ ], as well as updated the version of reprotest to 0.7.21 [ ].
In openSUSE, Bernhard M. Wiedemann published his usual openSUSE monthly report.
Versions 220 and 221 of diffoscope were uploaded to Debian, together with the following changes:

- Update external_tools.py to reflect changes to xxd and the vim-common package. [ ]
- The xxd tool is shipped in the xxd package now, not the vim-common package. [ ]

Bug reports were filed for a number of packages:

- at-spi-sharp (build failure when built on a multiprocessor machine)
- borgbackup (fails to build in 2038, fix)
- buzztrax (parallelism-related issue)
- chart-testing (date-related issue)
- memcached (fails to build in 2038)
- nim (fails to build in 2038)
- perl-Time-Moment (fails to build in 2038)
- python-bson (fails to build in 2038)
- python-heatclient (fails to build in 2038)
- python3.8 (fails to build in 2038)
- reproducible-faketools
- s3fs (date-related issue)
- systemd (date-related issue)
- wayfire

Changes were also submitted for the following packages:

- multipath-tools
- node-canvas-confetti
- psi (forwarded upstream)
- sphinx-panels (forwarded upstream)
- sysfsutils
- geeqie

Other changes this month included:

- Add deb-src lines to enable test builds as part of a Non-maintainer Upload (NMU) campaign targeting 708 sources without .buildinfo files found in Debian unstable, including 475 in bookworm. [ ][ ]
- … with the linux-image-generic kernel package installed. [ ]
- Set SOURCE_DATE_EPOCH for all our new bootstrap jobs. [ ]
- … the /bin/sh symlink [ ].
You can get in touch with the Reproducible Builds project via the #reproducible-builds channel on irc.oftc.net, or via the rb-general@lists.reproducible-builds.org mailing list.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.